Abstract: The widespread use of computers and the advent of the internet have made it easier to plagiarize the work of others. Most cases of plagiarism are found in academia, where the documents are typically essays or reports. Plagiarism detection can be manual or software-assisted. Software-assisted detection and analysis allows vast collections of documents to be compared with one another, which makes detection more accurate and more likely to succeed. Document clustering is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval. In technical publishing, authorship of a work is claimed by those who made intellectual contributions to the research described in the work. Analysis of a work to establish who wrote it is termed authorship analysis.
Keywords: Clustering, Author identification, k-means.